Creating an Order in Distributed Digital Libraries by Integrating Independent Self-Organizing Maps

Authors

  • Andreas Rauber
  • Dieter Merkl
Abstract

Digital document libraries are an almost perfect application arena for unsupervised neural networks. This is because many of the operations computers have to perform on text documents are classification tasks based on "noisy" input patterns. The "noise" arises because of the known inaccuracy of mapping natural language to an indexing vocabulary representing the contents of the documents. A growing number of papers are dedicated to the use of self-organizing maps to organize the contents of such digital libraries. These papers assume the central availability of the data, an assumption that is questionable given the massive amount of available information. In this paper we describe an approach for organizing distributed digital libraries based on a system of independent self-organizing maps, each of which represents just a portion of the complete digital library. Furthermore, we argue in favor of integrating these independent maps in a hierarchical fashion, again by means of self-organizing maps. The integration is based on the trained low-level maps.
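The two-level idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that every library describes its documents in the same feature space, and that "based on the trained low-level maps" means the weight vectors of the low-level maps are used as training input for the top-level map. The function names (`train_som`, `best_unit`, `integrate`), the toy 3-dimensional document vectors, and all parameter values are hypothetical.

```python
import math
import random


def best_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns one weight vector per unit (row-major)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):    # present inputs in random order
            br, bc = divmod(best_unit(weights, x), cols)
            for u, w in enumerate(weights):
                ur, uc = divmod(u, cols)
                d2 = (ur - br) ** 2 + (uc - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def integrate(low_level_maps, rows, cols, **kwargs):
    """Assumed integration step: train a top-level SOM on the weight vectors
    of all low-level maps instead of on the raw documents."""
    units = [w for som in low_level_maps for w in som]
    return train_som(units, rows, cols, **kwargs)


# Two toy "libraries", each held at a different site (hypothetical data).
library_a = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]]
library_b = [[0.0, 0.0, 1.0], [0.0, 0.1, 0.9], [1.0, 0.0, 0.1], [0.9, 0.0, 0.0]]

map_a = train_som(library_a, 2, 2)          # trained locally on library A
map_b = train_som(library_b, 2, 2)          # trained locally on library B
top_map = integrate([map_a, map_b], 2, 2)   # only the map weights are shared
```

Under these assumptions only the trained weight vectors leave each site, so the top-level map is trained on eight vectors here rather than on the full document collections.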


Similar resources

Organization of Distributed Digital Libraries: A Neural Network-Based Approach

Self-organizing maps are a popular neural network model for mapping high-dimensional input data onto a lower-dimensional output space. However, as the size of the training data increases, both the necessary computational power as well as the training time required exceed tolerable limits. Still more important, not all training data may be available in one central location but may rather be coll...


SOMLib: A Distributed Digital Library System based on Self-Organizing Maps

We describe an architecture for a distributed digital library system based on an unsupervised neural network model, namely the Self-Organizing Map. The system allows the clustering of text documents forming the basis for intelligent information retrieval. User profiles can be combined with full text queries or sample texts to locate documents within the library system. Contrary to conventional a...


MinervaDL: An Architecture for Information Retrieval and Filtering in Distributed Digital Libraries

We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of MinervaDL is based on the peer-to-peer search engine Minerva, and is able to handle huge amounts of data provided by digital libraries in a distributed and self-organizing way. The two-tier architecture and the us...


A Scalable Self-organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation

The rapid proliferation of textual and multimedia online databases, digital libraries, Internet servers, and intranet services has turned researchers' and practitioners' dream of creating an information-rich society into a nightmare of info-gluts. Many researchers believe that turning an info-glut into a useful digital library requires automated techniques for organizing and categorizing large-...


Text Data Mining

Classification is one of the central issues in any system dealing with text data. The need for effective approaches is dramatically increased nowadays due to the advent of massive digital libraries containing free-form documents. What we are looking for are powerful methods for the exploration of such libraries whereby the discovery of similarities between groups of text documents is the overall ...




Publication date: 1998